Blind segmentation of a multi-speaker conversation using two different sets of features
Abstract
An algorithm is presented for labeling each time frame of a two-speaker phone call with the active speaker. The algorithm first clusters audio frames according to one feature set, then models the speakers within each cluster and iteratively resegments the audio over a different feature set. The first clustering stage is expected to yield clusters that contain audio from both speakers, grouped by phonetic content. The second stage is expected to separate each of those clusters by speaker, since the textual content within each cluster is more uniform. Methods for measuring performance on the blind segmentation task are discussed. The algorithm is evaluated on conversations from the SPIDRE database [9].
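The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the feature sets, the speaker models, or the initialization. This sketch assumes plain k-means for the first stage, one mean vector per (cluster, speaker) pair as the speaker model, an initial split along the first principal component of the cluster-centered second feature set, and a sliding-window vote for resegmentation. All function names (`kmeans`, `two_stage_segmentation`, `blind_accuracy`, `toy_call`) are hypothetical. Because blind segmentation produces arbitrary speaker labels, the scoring function takes the better of the two possible label assignments.

```python
import numpy as np


def kmeans(x, k, n_iter=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


def two_stage_segmentation(feat_a, feat_b, n_clusters=4, n_iter=10, win=9):
    """Return a 0/1 speaker label per frame (label identity is arbitrary)."""
    # Stage 1: group frames by (assumed phonetic) content using feature set A.
    clusters = kmeans(feat_a, n_clusters)

    # Assumed initialization: remove each cluster's mean from feature set B,
    # then split frames by the sign of the first principal component.
    centered = feat_b.astype(float)
    for c in range(n_clusters):
        m = clusters == c
        if m.any():
            centered[m] -= centered[m].mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    speakers = (centered @ vt[0] > 0).astype(int)

    # Stage 2: per-cluster speaker models on feature set B, then resegment.
    kernel = np.ones(win)
    for _ in range(n_iter):
        score = np.zeros((len(feat_b), 2))  # squared distance to each model
        for c in range(n_clusters):
            m = clusters == c
            if not m.any():
                continue
            for s in (0, 1):
                sel = m & (speakers == s)
                if not sel.any():
                    sel = m  # fall back to the whole cluster
                mean = feat_b[sel].mean(axis=0)
                score[m, s] = ((feat_b[m] - mean) ** 2).sum(axis=1)
        # Sum the score difference over neighboring frames so that one
        # decision draws on evidence from several phonetic clusters.
        diff = np.convolve(score[:, 1] - score[:, 0], kernel, mode="same")
        new = (diff < 0).astype(int)
        if np.array_equal(new, speakers):
            break
        speakers = new
    return speakers


def blind_accuracy(ref, hyp):
    """Frame accuracy for two speakers under the better label assignment."""
    acc = float(np.mean(ref == hyp))
    return max(acc, 1.0 - acc)


def toy_call(n_frames=600, turn=50, n_phones=4, seed=0):
    """Synthetic stand-in for a call; not SPIDRE data."""
    rng = np.random.default_rng(seed)
    speaker = (np.arange(n_frames) // turn) % 2
    phone = rng.integers(0, n_phones, n_frames)
    feat_a = 10.0 * np.eye(n_phones)[phone]
    feat_a += 0.3 * rng.standard_normal((n_frames, n_phones))
    phone_shift = 5.0 * rng.standard_normal((n_phones, 3))
    feat_b = phone_shift[phone] + 0.3 * rng.standard_normal((n_frames, 3))
    feat_b[:, 0] += np.where(speaker == 1, 1.5, -1.5)
    return feat_a, feat_b, speaker
```

Calling `two_stage_segmentation(*toy_call()[:2])` returns one label per frame, which `blind_accuracy` can compare against the synthetic reference. In the toy data the phonetic shift in the second feature set is deliberately larger than the speaker shift, which is the situation the first clustering stage is meant to handle.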
Similar resources
Evolutive speaker segmentation using a repository system
When performing blind speaker segmentation, one of the main problems is not knowing how many speakers appear in a conversation and whether they appear once or more than once. In this paper, an iterative method based on the EvolutiveHMM is presented. Two main improvements to this system are introduced. On one hand, a repository generic speaker is used to model all utterances and all spea...
Unsupervised segmentation and verification of multi-speaker conversational speech
This paper presents our approach to unsupervised multispeaker conversational speech segmentation. Speech segmentation is obtained in two steps that employ different techniques. The first step performs a preliminary segmentation of the conversation by analyzing fixed-length slices, and assumes the presence of one or two speakers in every slice. The second step clusters the segments obtained by the ...
Automatic speaker clustering from multi-speaker utterances
Blind clustering of multi-person utterances by speaker is complicated by the fact that each utterance has at least two talkers. In the case of a two-person conversation, one can simply split each conversation into its respective speaker halves, but this introduces error which ultimately hurts clustering. We propose a clustering algorithm which is capable of associating each conversation with tw...
Speaker Segmentation Based on Subsegmental Features and Neural Network Models
In this paper, we propose an alternative approach for detecting speaker changes in a multispeaker speech signal. Current approaches to speaker segmentation employ features based on characteristics of the vocal tract system, and they rely on the dissimilarity between the distributions of two sets of feature vectors. This statistical approach to a point phenomenon (speaker change) fails when the gi...